AITopics

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Tatiana Shpakova, Francis Bach

Parameter Learning for Log-supermodular Distributions

Neural Information Processing Systems, Apr-22-2026, 11:27:42 GMT

We consider log-supermodular models on binary variables, that is, probabilistic models whose negative log-densities are submodular. These models provide probabilistic interpretations of common combinatorial optimization tasks such as image segmentation. In this paper, we focus primarily on parameter estimation in the models from known upper bounds on the intractable log-partition function. We show that the bound based on separable optimization on the base polytope of the submodular function is always inferior to a bound based on "perturb-and-MAP" ideas. Then, to learn parameters, given that our approximation of the log-partition function is an expectation (over our own randomization), we use a stochastic subgradient technique to maximize a lower bound on the log-likelihood. This can also be extended to conditional maximum likelihood. We illustrate our new results in a set of experiments in binary image denoising, where we highlight the flexibility of a probabilistic model to learn with missing data.
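To make the "perturb-and-MAP" idea concrete, here is a minimal sketch on a three-variable toy model. It assumes the Gumbel form of the bound (the log-partition function is at most the expected maximum of the score perturbed by independent zero-mean Gumbel noise on each variable state), and the MAP step is done by brute force. The energy `cut_energy`, its weights, and all function names are hypothetical and chosen only for illustration; a real model would use a submodular minimization or graph-cut solver for the MAP step.

```python
import itertools
import math
import random

EULER_GAMMA = 0.5772156649015329  # mean of a standard Gumbel variable


def cut_energy(x, unary, edges):
    """Submodular energy f(x): unary terms plus a weight for every cut edge."""
    modular = sum(u * xi for u, xi in zip(unary, x))
    pairwise = sum(w for i, j, w in edges if x[i] != x[j])
    return modular + pairwise


def exact_log_partition(n, energy):
    """Brute-force log sum_x exp(-f(x)); only feasible for tiny n."""
    scores = [-energy(x) for x in itertools.product((0, 1), repeat=n)]
    top = max(scores)
    return top + math.log(sum(math.exp(s - top) for s in scores))


def zero_mean_gumbel(rng):
    """Standard Gumbel sample, shifted so that its mean is zero."""
    e = max(rng.expovariate(1.0), 1e-300)  # guard against log(0)
    return -math.log(e) - EULER_GAMMA


def perturb_and_map_bound(n, energy, num_samples, rng):
    """Monte Carlo estimate of E[max_x (-f(x) + sum_i g_i(x_i))]."""
    states = list(itertools.product((0, 1), repeat=n))
    total = 0.0
    for _ in range(num_samples):
        # One independent noise value per variable and per state (0 or 1).
        noise = [(zero_mean_gumbel(rng), zero_mean_gumbel(rng)) for _ in range(n)]
        total += max(
            -energy(x) + sum(noise[i][x[i]] for i in range(n)) for x in states
        )
    return total / num_samples


unary = [0.2, -0.3, 0.1]            # hypothetical unary weights
edges = [(0, 1, 1.0), (1, 2, 0.5)]  # nonnegative weights keep f submodular
energy = lambda x: cut_energy(x, unary, edges)

exact = exact_log_partition(3, energy)
bound = perturb_and_map_bound(3, energy, 5000, random.Random(0))
print(f"exact log-partition: {exact:.3f}, perturb-and-MAP estimate: {bound:.3f}")
```

Because the estimate is an average over random perturbations, each sample also yields a stochastic subgradient, which is what makes the stochastic learning scheme described in the abstract possible.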

artificial intelligence, bayesian inference, machine learning

Country: Europe (0.46)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.36)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.36)

arXiv.org Artificial Intelligence, Nov-18-2025

Quantum Machine Learning via Contrastive Training

Zhukas, Liudmila A., Zhang, Vivian Ni, Miao, Qiang, Wang, Qingfeng, Cetina, Marko, Kim, Jungsang, Carin, Lawrence, Monroe, Christopher

Quantum machine learning (QML) has attracted growing interest with the rapid parallel advances in large-scale classical machine learning and quantum technologies. Similar to classical machine learning, QML models also face challenges arising from the scarcity of labeled data, particularly as their scale and complexity increase. Here, we introduce self-supervised pretraining of quantum representations that reduces reliance on labeled data by learning invariances from unlabeled examples. We implement this paradigm on a programmable trapped-ion quantum computer, encoding images as quantum states. In situ contrastive pretraining on hardware yields a representation that, when fine-tuned, classifies image families with higher mean test accuracy and lower run-to-run variability than models trained from random initialization. Performance improvement is especially significant in regimes with limited labeled training data. We show that the learned invariances generalize beyond the pretraining image samples. Unlike prior work, our pipeline derives similarity from measured quantum overlaps and executes all training and classification stages on hardware. These results establish a label-efficient route to quantum representation learning, with direct relevance to quantum-native datasets and a clear path to larger classical inputs.

artificial intelligence, machine learning, quantum processor

2511.13497

Country: North America > United States (0.68)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Neural Information Processing Systems, Oct-9-2025, 13:21:28 GMT

13e36f06c66134ad65f532e90d898545-Supplemental.pdf

artificial intelligence, machine learning, nu-net

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Lourenço, Vítor N., Silva, Gabriela G., Fernandes, Leandro A. F.

Hierarchy-of-Visual-Words: a Learning-based Approach for Trademark Image Retrieval

arXiv.org Artificial Intelligence, Jul-30-2025

From the background, the procedure extracts the holes' shapes and associates them with the list of component shapes (lines 7 and 8). The foreground shapes are used in the next iterations (lines 5 and 9) until all component shapes have been extracted from the initial binary trademark image. Shape feature extraction consists of building a feature vector for each component shape of a given trademark image (Figs. 1 (d) and (k)). These 29-dimensional feature vectors combine region-based and contour-based descriptors. A shape's region is described by the 25 moments of the Zernike polynomials (ZM) of order $p$ from 0 to 8: $$Z_{p,q} = \frac{p+1}{\pi} \sum_{\rho} \sum_{\theta} V_{p,q}(\rho,\theta)\, I(\rho,\theta), \tag{1}$$ where $\rho = \sqrt{x^2 + y^2}$ is the length of the vector from the origin to pixel $(x,y)$, $\theta$ is the angle between the vector defining $\rho$ and the $x$-axis in the counterclockwise direction, and $V_{p,q}(\rho,\theta)$ is a Zernike polynomial of order $p$ with repetition $q$ that forms a complete set over the interior of the unit disk inscribing the component shape: $$V_{p,q}(\rho,\theta) = R_{p,q}(\rho) \exp(i q \theta).$$
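The Zernike moment definition above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: it assumes a square image whose pixel centers are mapped onto the square enclosing the unit disk, uses the standard closed form for the radial polynomial $R_{p,q}$, and follows the formula as printed (many references use the complex conjugate of $V_{p,q}$ instead, which leaves the magnitude unchanged). The function names are hypothetical.

```python
import cmath
import math


def zernike_indices(max_order):
    """Pairs (p, q) with 0 <= q <= p and p - q even."""
    return [
        (p, q)
        for p in range(max_order + 1)
        for q in range(p + 1)
        if (p - q) % 2 == 0
    ]


def radial_polynomial(p, q, rho):
    """Radial part R_{p,q}(rho) of the Zernike polynomial (standard closed form)."""
    q = abs(q)
    total = 0.0
    for s in range((p - q) // 2 + 1):
        coeff = (-1) ** s * math.factorial(p - s) / (
            math.factorial(s)
            * math.factorial((p + q) // 2 - s)
            * math.factorial((p - q) // 2 - s)
        )
        total += coeff * rho ** (p - 2 * s)
    return total


def zernike_moment(image, p, q):
    """Z_{p,q} of a square image, summing over pixels inside the unit disk."""
    n = len(image)
    total = 0j
    for row in range(n):
        for col in range(n):
            x = (2 * col + 1 - n) / n   # pixel center mapped to (-1, 1)
            y = (n - 1 - 2 * row) / n   # row 0 is the top of the image
            rho = math.hypot(x, y)
            if rho > 1.0:
                continue                # outside the unit disk
            theta = math.atan2(y, x)
            v = radial_polynomial(p, q, rho) * cmath.exp(1j * q * theta)
            total += v * image[row][col]
    return (p + 1) / math.pi * total


indices = zernike_indices(8)
shape = [[0, 1, 1, 0], [1, 1, 1, 1], [1, 1, 1, 1], [0, 1, 1, 0]]
features = [abs(zernike_moment(shape, p, q)) for p, q in indices]
print(len(features))
```

Orders 0 through 8 give exactly 25 admissible $(p, q)$ pairs with $q \ge 0$, which matches the 25 moments mentioned in the text. Taking magnitudes for the feature vector is an assumption made here for rotation invariance.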

machine learning, pattern recognition, trademark image

doi: 10.5753/sibgrapi.2019.9803

1908.02786

Country: South America > Brazil > Rio de Janeiro (0.28)

Genre: Research Report (1.00)

Industry: Law > Intellectual Property & Technology Law (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.93)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.93)

Saleh, Ahmed Sobhi, Croes, Kristof, Ceric, Hajdin, De Wolf, Ingrid, Zahedmanesh, Houman

Novel Concept-Oriented Synthetic Data approach for Training Generative AI-Driven Crystal Grain Analysis Using Diffusion Model

arXiv.org Artificial Intelligence, Apr-22-2025

The traditional techniques for extracting polycrystalline grain structures from microscopy images, such as transmission electron microscopy (TEM) and scanning electron microscopy (SEM), are labour-intensive, subjective, and time-consuming, limiting their scalability for high-throughput analysis. In this study, we present an automated methodology integrating edge detection with generative diffusion models to effectively identify grains, eliminate noise, and connect broken segments in alignment with predicted grain boundaries. Due to the limited availability of adequate images preventing the training of deep machine learning models, a new seven-stage methodology is employed to generate synthetic TEM images for training. This concept-oriented synthetic data approach can be extended to any field of interest where the scarcity of data is a challenge. The presented model was applied to various metals with average grain sizes down to the nanoscale, producing grain morphologies from low-resolution TEM images that are comparable to those obtained from advanced and demanding experimental techniques with an average accuracy of 97.23%.

artificial intelligence, deep learning, machine learning

doi: 10.1016/j.commatsci.2025.113723

2504.14782

Country:

Europe (1.00)
North America > United States > California (0.46)

Genre:

Research Report > Promising Solution (0.40)
Research Report > New Finding (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.50)

arXiv.org Artificial Intelligence, Mar-3-2025

An Approach for Air Drawing Using Background Subtraction and Contour Extraction

Acharya, Ramkrishna

In this paper, we propose a novel approach for air drawing that uses image processing techniques to draw on the screen by moving fingers in the air. This approach benefits a wide range of applications such as sign language, in-air drawing, and 'writing' in the air as a new way of input. The approach starts with preparing ROI (Region of Interest) background images by taking a running average over the initial camera frames and later subtracting it from the live camera frames to get a binary mask image. We calculate the pointer's position as the top of the contour on the binary image. Drawing a circle on the canvas at that position simulates the drawing. Furthermore, we incorporate the pre-trained Tesseract model for OCR purposes. To address false contours, we perform hand detection based on the Haar cascade before performing the background subtraction. In an experimental setup, we achieved a latency of only 100 ms in air drawing.
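The background-subtraction steps described above can be sketched without any imaging library. This is a simplified illustration under stated assumptions: frames are small grayscale grids given as lists of rows, the blending factor `alpha` and the `threshold` value are hypothetical, and the "top of the contour" is approximated by the topmost foreground pixel rather than by a real contour extraction.

```python
def update_background(background, frame, alpha=0.5):
    """Running average: background <- (1 - alpha) * background + alpha * frame."""
    return [
        [(1 - alpha) * b + alpha * f for b, f in zip(bg_row, fr_row)]
        for bg_row, fr_row in zip(background, frame)
    ]


def binary_mask(background, frame, threshold=25):
    """Mark pixels whose absolute difference from the background exceeds threshold."""
    return [
        [1 if abs(f - b) > threshold else 0 for b, f in zip(bg_row, fr_row)]
        for bg_row, fr_row in zip(background, frame)
    ]


def topmost_point(mask):
    """Return (x, y) of the first foreground pixel scanning from the top, or None."""
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                return (x, y)
    return None


# Build the background from a few initial frames of an empty scene.
initial_frames = [[[10] * 4 for _ in range(4)] for _ in range(3)]
background = [[float(v) for v in row] for row in initial_frames[0]]
for frame in initial_frames[1:]:
    background = update_background(background, frame)

# A live frame with a bright "finger" in column 2, rows 1 to 3.
live = [[10] * 4 for _ in range(4)]
for y in (1, 2, 3):
    live[y][2] = 200

mask = binary_mask(background, live)
pointer = topmost_point(mask)
canvas = set()
if pointer is not None:
    canvas.add(pointer)  # a real canvas would draw a small circle here
print(pointer)
```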

background image, background subtraction, contour

2503.01497

Country:

North America > United States (0.05)
Europe > Germany (0.05)

Genre: Research Report > Promising Solution (0.35)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.74)
Information Technology > Artificial Intelligence > Vision (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.30)

Pedro Felzenszwalb, John G. Oberlin

Multiscale Fields of Patterns

Neural Information Processing Systems, Feb-8-2025, 21:53:01 GMT

We describe a framework for defining high-order image models that can be used in a variety of applications. The approach involves modeling local patterns in a multiscale representation of an image. Local properties of a coarsened image reflect non-local properties of the original image. In the case of binary images, local properties are defined by the binary patterns observed over small neighborhoods around each pixel. With the multiscale representation, we capture the frequency of patterns observed at different scales of resolution. This framework leads to expressive priors that depend on a relatively small number of parameters. For inference and learning, we use an MCMC method for block sampling with very large blocks. We evaluate the approach with two example applications.
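As a rough illustration of counting local binary patterns at several resolutions, the sketch below builds an image pyramid and tallies the 3x3 patterns seen at each level. The coarsening rule (logical OR over 2x2 blocks), the 3x3 neighborhood size, and the function names are assumptions made for this example; the abstract does not specify them.

```python
from collections import Counter


def coarsen(image):
    """Halve the resolution; each coarse pixel is the OR of a 2x2 block (assumed rule)."""
    height, width = len(image) // 2, len(image[0]) // 2
    return [
        [
            image[2 * r][2 * c]
            | image[2 * r][2 * c + 1]
            | image[2 * r + 1][2 * c]
            | image[2 * r + 1][2 * c + 1]
            for c in range(width)
        ]
        for r in range(height)
    ]


def pattern_counts(image, size=3):
    """Count every size x size binary pattern that occurs in the image."""
    counts = Counter()
    for r in range(len(image) - size + 1):
        for c in range(len(image[0]) - size + 1):
            patch = tuple(
                image[r + dr][c + dc] for dr in range(size) for dc in range(size)
            )
            counts[patch] += 1
    return counts


def multiscale_pattern_counts(image, levels, size=3):
    """Pattern counts at the original resolution and at each coarser level."""
    result = []
    for level in range(levels):
        if level > 0:
            image = coarsen(image)
        result.append(pattern_counts(image, size))
    return result


# An 8x8 binary image with a single vertical line in column 3.
image = [[1 if c == 3 else 0 for c in range(8)] for _ in range(8)]
for level, counts in enumerate(multiscale_pattern_counts(image, levels=2)):
    print(level, sum(counts.values()), len(counts))
```

A thin line occupies one pixel column at the fine scale but still shows up after coarsening, which is one way to see how local statistics of a coarse image reflect non-local structure in the original.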

artificial intelligence, fop model, machine learning

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Rhode Island > Providence County > Providence (0.04)
Europe > Sweden > Östergötland County > Linköping (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.71)

arXiv.org Artificial Intelligence, Oct-13-2024

Multi class activity classification in videos using Motion History Image generation

Gopal, Senthilkumar

Human action recognition has been a topic of interest across multiple fields ranging from security to entertainment systems. Tracking motion and identifying the action being performed in real time is necessary for critical security systems. In entertainment, especially gaming, immediate responses to actions and gestures are paramount for the success of the system. We show that the Motion History Image (MHI) is a well-established framework for capturing temporal and activity information in multidimensional detail, enabling various use cases including classification. We utilize MHI to produce sample data to train a classifier and demonstrate its effectiveness for action classification across six different activities in a single multi-action video. We analyze the classifier's performance, identify use cases where MHI struggles to generate the appropriate activity image, and discuss mechanisms and future work to overcome those limitations.
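For readers unfamiliar with Motion History Images, the following sketch shows the commonly used update rule: pixels where motion is detected are set to a maximum value `tau`, and all other pixels decay by one per frame. Frame differencing with a fixed threshold is assumed as the motion detector, and the parameter values are hypothetical; the paper's exact pipeline may differ.

```python
def motion_mask(prev_frame, frame, threshold=30):
    """Flag pixels whose intensity changed by more than threshold between frames."""
    return [
        [1 if abs(f - p) > threshold else 0 for p, f in zip(prev_row, row)]
        for prev_row, row in zip(prev_frame, frame)
    ]


def update_mhi(mhi, mask, tau):
    """Set moving pixels to tau; let the remaining pixels decay toward zero."""
    return [
        [tau if m else max(0, h - 1) for h, m in zip(mhi_row, mask_row)]
        for mhi_row, mask_row in zip(mhi, mask)
    ]


# A bright pixel moving left to right across a single-row "video".
frames = [
    [[255, 0, 0, 0, 0]],
    [[0, 255, 0, 0, 0]],
    [[0, 0, 255, 0, 0]],
    [[0, 0, 0, 255, 0]],
]

mhi = [[0] * 5]
for prev_frame, frame in zip(frames, frames[1:]):
    mhi = update_mhi(mhi, motion_mask(prev_frame, frame), tau=3)
print(mhi)
```

More recent motion leaves higher values, so the resulting image encodes both where and roughly when motion occurred, which is what a downstream classifier can exploit.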

artificial intelligence, machine learning, recognition

2410.09902

Country: North America > United States > Massachusetts (0.04)

Genre:

Research Report (0.51)
Instructional Material (0.50)

Industry: Education (0.51)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Chaudhary, Kundan, Shaar, Subhei, Muthinti, Raja

Deep learning for fast segmentation and critical dimension metrology & characterization enabling AR/VR design and fabrication

arXiv.org Artificial Intelligence, Sep-20-2024

Quantitative analysis of microscopy images is essential in the design and fabrication of components used in augmented reality/virtual reality (AR/VR) modules. However, segmenting regions of interest (ROIs) from these complex images and extracting critical dimensions (CDs) requires novel techniques, such as deep learning models, which are key for actionable decisions on process, material and device optimization. In this study, we report on the fine-tuning of a pre-trained 'Segment Anything Model' (SAM) using a diverse dataset of electron microscopy images. We employed methods such as low-rank adaptation (LoRA) to reduce training time and enhance the accuracy of ROI extraction. The model's ability to generalize to unseen images facilitates zero-shot learning and supports a CD extraction model that precisely extracts CDs from the segmented ROIs. We demonstrate the accurate extraction of binary images from cross-sectional images of surface relief gratings (SRGs) and Fresnel lenses in both single and multiclass modes. Furthermore, these binary images are used to identify transition points, aiding in the extraction of relevant CDs. The combined use of the fine-tuned segmentation model and the CD extraction model offers substantial advantages to various industrial applications by enhancing analytical capabilities, shortening time to data and insights, and optimizing manufacturing processes.
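The idea of using transition points in a binary image to measure dimensions can be illustrated on a single scan line. This is a hedged sketch rather than the paper's CD extraction model: it assumes features are runs of 1s in a row of the segmented mask, and `pixel_size` is a hypothetical calibration factor converting pixels to physical units.

```python
def transition_points(row):
    """Indices where the binary value differs from the previous pixel."""
    return [i for i in range(1, len(row)) if row[i] != row[i - 1]]


def feature_widths(row, pixel_size=1.0):
    """Widths of runs of 1s bounded by a rising and a falling transition."""
    widths = []
    start = None
    for i in transition_points(row):
        if row[i] == 1:
            start = i                       # rising edge: feature begins
        elif start is not None:
            widths.append((i - start) * pixel_size)  # falling edge: feature ends
            start = None
    return widths


scan_line = [0, 0, 1, 1, 1, 0, 0, 1, 1, 0]
print(transition_points(scan_line))
print(feature_widths(scan_line, pixel_size=0.5))
```

Runs that touch the image border are skipped in this sketch because one of their edges is not observed, so their width would be underestimated.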

ellipse, extraction, segmentation

2409.13951

Country: North America > United States > Washington > King County > Redmond (0.04)

Genre: Research Report > Promising Solution (0.48)

Industry: Health & Medicine (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)